We consider a quantum polynomial-time algorithm which solves the discrete logarithm
problem for points on elliptic curves over $GF(2^m)$. We improve over earlier algorithms
by constructing an efficient circuit for multiplying elements of binary finite fields and
by representing elliptic curve points using a technique based on projective coordinates.
The depth of our proposed implementation, executable in the Linear Nearest Neighbor
(LNN) architecture, is $O(m^2)$, which is an improvement over the previous bound of
$O(m^3)$ derived assuming no architectural restrictions.

1 Introduction

Quantum computing [14] has the ability to solve problems whose best classical solutions are 
considered inefficient. Perhaps the best-known example is Shor's polynomial-time integer 
factorization algorithm [19], where the best known classical technique, the General Number 
Field Sieve, has superpolynomial complexity $\exp O(\sqrt[3]{n \log^2 n})$ in the number of bits $n$ [22].



"An earlier version of this paper has been presented at the 3rd Workshop on Theory of Quantum Computation, 
Communication, and Cryptography, Tokyo, Japan, January 30 - February 1, 2008. 






2 On the Design and Optimization of a Quantum Polynomial- Time Attack on Elliptic Curve Cryptography . . . 
Table 1. Comparing hardware performance of RSA-3072 and ECC-256 [21].

Mode              RSA-3072                 ECC-256
Space-optimized   184 ms, 50,000 gates     29 ms, 6,660 gates
Speed-optimized   110 ms, 189,200 gates    1.3 ms, 80,100 gates



Since a hardware implementation of this algorithm on a suitable quantum mechanical system 
could be used to crack the RSA cryptosystem [22], these results force researchers to rethink 
the assumptions of classical cryptography and to consider optimized circuits for the two 
main parts of Shor's factorization algorithm: the quantum Fourier transform [14, 4] and
modular exponentiation [13]. Quantum noise and issues of scalability in quantum information
processing proposals require circuit designers to consider optimization carefully. 

Since the complexity of breaking RSA is subexponential, cryptosystems such as Elliptic
Curve Cryptography (ECC) have become increasingly popular. The best known classical
attack on ECC requires an exponential search with complexity $O(2^{n/2})$. The difference is
substantial: a 256-bit ECC key requires the same effort to break as a 3072-bit RSA key. The
largest publicly broken ECC system has a key length of 109 bits [3], while key lengths of
1024 bits and higher are strongly recommended for RSA. However, the key lengths represent
only the communication cost of a cryptographic protocol. It might, in principle, be possible
that a hardware implementation of ECC is overwhelmingly less efficient, which would
undermine its practical efficiency. This is not the case; indeed, the situation
is rather the opposite. Table 1 compares the efficiency of CMOS circuits (with the same clock
speed) implementing 3072-bit RSA and 256-bit ECC, which both give equivalent security of
128 bits, in two hardware modes: optimized for space (cost) and for speed (runtime). It is clear
that the ECC implementation is more efficient. The relative efficiency of ECC as compared to
RSA has been widely recognized. For instance, ECC has been acknowledged by the National
Security Agency as a secure protocol and included in their Suite B [15].

Most ECC implementations are built over $GF(2^m)$, likely due to the efficiency of the relevant
hardware and the ease of mapping a key into a binary register. Software implementations,
such as ECC over GF{2^^^), are publicly available [1], making ECC ready to use for any 
interested party. 

There exists a quantum polynomial-time algorithm that solves the Elliptic Curve Discrete
Logarithm Problem (ECDLP) and thus cracks elliptic curve cryptography [17]. As with
Shor's factorization algorithm, this algorithm should be studied in detail by anyone interested
in the threat posed by quantum computing to modern cryptography. The quantum
algorithm for solving discrete logarithm problems in cyclic groups such as the one used in
ECC requires computing sums and products of elements of a finite field, such as $GF(2^m)$ [5].
Addition in $GF(2^m)$ requires only a depth-1 circuit consisting of parallel CNOT gates [5].
We present a depth $O(m)$ multiplication circuit for $GF(2^m)$ optimized for the Linear Nearest
Neighbor (LNN) architecture. Our circuit is based on the construction by Mastrovito [10].
Previously, a depth $O(m^2)$ circuit in an unrestricted architecture was found in [5]. With the
use of our multiplication circuit the depth of the quantum discrete logarithm algorithm over
the points on elliptic curves over $GF(2^m)$ drops from $O(m^3)$ to $O(m^2)$. Our implementation
is optimized for LNN, unlike the previously constructed circuit, whose depth, if restricted to LNN,
may become as high as O(m^). 

The paper is organized as follows. In Section 2 we give an overview of quantum computation,
$GF(2^m)$ field arithmetic, and elliptic curve arithmetic. Section 3 outlines the quantum
algorithm and presents our improvements: the $GF(2^m)$ multiplication circuit and the projective
coordinate representation. The paper concludes with some observations and suggestions for
further research.

2 Preliminaries 

We will be working in the quantum circuit model, where data is stored in qubits and unitary 
operations are applied to various qubits at discrete time steps as quantum gates. We assume 
that any set of non-intersecting gates may be applied within one time step. The total number 
of time steps required to execute an algorithm as a circuit is the depth. Further details on 
quantum computation in the circuit model can be found in p3) . 

We will make use of the CNOT, Toffoli and SWAP gates. The CNOT gate is defined
as the unitary operator which performs the transformation $|a\rangle|b\rangle \mapsto |a\rangle|a \oplus b\rangle$. The Toffoli
gate [20] can be described as a controlled CNOT gate, and performs the transformation over
the computational basis given by the formula $|a\rangle|b\rangle|c\rangle \mapsto |a\rangle|b\rangle|c \oplus ab\rangle$. Finally, SWAP
interchanges the values of two qubits, i.e., performs the operation $|a\rangle|b\rangle \mapsto |b\rangle|a\rangle$.
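The action of these three gates on computational basis states can be sketched classically. The following is a minimal illustration in Python; the function names are chosen here for illustration and do not come from the paper, and only basis states (not superpositions) are modeled.

```python
from itertools import product


def cnot(a, b):
    """|a>|b> -> |a>|a XOR b> on computational basis bits."""
    return a, a ^ b


def toffoli(a, b, c):
    """|a>|b>|c> -> |a>|b>|c XOR ab> on computational basis bits."""
    return a, b, c ^ (a & b)


def swap(a, b):
    """|a>|b> -> |b>|a> on computational basis bits."""
    return b, a


# Each gate is its own inverse on basis states: applying it twice restores the input.
for bits in product((0, 1), repeat=3):
    assert toffoli(*toffoli(*bits)) == bits
for bits in product((0, 1), repeat=2):
    assert cnot(*cnot(*bits)) == bits
    assert swap(*swap(*bits)) == bits
```

Because every gate here is a permutation of basis states, a circuit built from them is reversible, which is what allows the arithmetic circuits of Section 3 to act on superpositions.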

2.1 Binary Field Arithmetic 

The finite field $GF(2^m)$ consists of a set of $2^m$ elements with addition and multiplication
operations, and additive and multiplicative identities 0 and 1, respectively. $GF(2^m)$ forms a
commutative ring over these two operations where each non-zero element has a multiplicative
inverse. The finite field $GF(2^m)$ is unique up to isomorphism.

We can represent the elements of $GF(2^m)$, where $m > 2$, with the help of an irreducible
primitive polynomial of the form $P(x) = \sum_{i=0}^{m} c_i x^i$, where $c_i \in GF(2)$ [16]. The finite
field $GF(2^m)$ is isomorphic to the set of polynomials over $GF(2)$ modulo $P(x)$. In other
words, elements of $GF(2^m)$ can be represented as polynomials over $GF(2)$ of degree at most
$m - 1$, where the product of two elements is the product of their polynomial representations,
reduced modulo $P(x)$ [16, 18]. As the sum of two polynomials is simply the bitwise XOR
of the coefficients, it is convenient to express these polynomials as bit vectors of length $m$.
Additional properties of finite fields can be found in [16].

Mastrovito has proposed an algorithm along with a classical circuit implementation for
polynomial basis (PB) multiplication [10, 11], popularly known as the Mastrovito multiplier.
Based on Mastrovito's algorithm, [18] presents a formulation of PB multiplication and a
generalized parallel-bit hardware architecture for special types of primitive polynomials, namely
trinomials, equally spaced polynomials (ESPs), and two classes of pentanomials.

Consider the inputs $a$ and $b$, with $a = [a_0, a_1, a_2, \ldots, a_{m-1}]^T$ and $b = [b_0, b_1, b_2, \ldots, b_{m-1}]^T$,
where the coordinates $a_i$ and $b_i$, $0 \le i < m$, are the coefficients of two polynomials $A(x)$ and
$B(x)$ representing two elements of $GF(2^m)$ with respect to a primitive polynomial
$P(x)$. We use the following three matrices:

1. an $m \times (m - 1)$ reduction matrix $M$,

2. an $m \times m$ lower triangular matrix $L$, and

3. an $(m - 1) \times m$ upper triangular matrix $U$.
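Vectors $d$ and $e$ are defined as:

$$d = Lb, \qquad (1)$$
$$e = Ub, \qquad (2)$$

where $L$ and $U$ are defined as

$$L = \begin{bmatrix}
a_0 & 0 & \cdots & 0 & 0 \\
a_1 & a_0 & \cdots & 0 & 0 \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
a_{m-2} & a_{m-3} & \cdots & a_0 & 0 \\
a_{m-1} & a_{m-2} & \cdots & a_1 & a_0
\end{bmatrix}, \qquad
U = \begin{bmatrix}
0 & a_{m-1} & a_{m-2} & \cdots & a_2 & a_1 \\
0 & 0 & a_{m-1} & \cdots & a_3 & a_2 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & a_{m-1} & a_{m-2} \\
0 & 0 & 0 & \cdots & 0 & a_{m-1}
\end{bmatrix}.$$
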

Note that $d$ and $e$ correspond to polynomials $D(x)$ and $E(x)$ such that
$A(x)B(x) = D(x) + x^m E(x)$. Using $P(x)$, we may construct a matrix $M$ which converts the
coefficients of any polynomial $x^m E(x)$ to the coefficients of an equivalent polynomial modulo
$P(x)$ with degree less than $m$. Thus, the vector

$$c = d + Me \qquad (3)$$

gives the coefficients of the polynomial representing the product of $a$ and $b$. The construction
of the matrix $M$, which is dependent on the primitive polynomial $P(x)$, is given in [18].
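The decomposition $c = d + Me$ can be checked with a short classical computation. The sketch below is an illustration only: the helper names `reduction_matrix` and `mastrovito_multiply` are hypothetical, and $M$ is built here by repeatedly multiplying $x^m \bmod P(x)$ by $x$, which is one straightforward way to obtain it and not necessarily the construction given in [18].

```python
def reduction_matrix(p_coeffs):
    """Return M (m rows, m-1 columns); column j holds x^(m+j) mod P(x).

    p_coeffs[i] is the coefficient of x^i in P(x), so len(p_coeffs) == m + 1.
    """
    m = len(p_coeffs) - 1
    low = p_coeffs[:m]              # over GF(2), x^m equals the low-order part of P(x)
    col = list(low)
    cols = []
    for _ in range(m - 1):
        cols.append(col)
        top = col[m - 1]            # coefficient that overflows when multiplying by x
        shifted = [0] + col[:m - 1]
        col = [s ^ (top & p) for s, p in zip(shifted, low)]
    # Transpose the list of columns into a list of rows.
    return [[cols[j][i] for j in range(m - 1)] for i in range(m)]


def mastrovito_multiply(a, b, p_coeffs):
    """Multiply bit vectors a and b (index i = coefficient of x^i) as c = d + M e."""
    m = len(a)
    d = [0] * m                     # d = L b: terms a_i b_j with i + j < m
    e = [0] * (m - 1)               # e = U b: terms a_i b_j with i + j >= m
    for i in range(m):
        for j in range(m):
            if i + j < m:
                d[i + j] ^= a[i] & b[j]
            else:
                e[i + j - m] ^= a[i] & b[j]
    M = reduction_matrix(p_coeffs)
    return [d[i] ^ (sum(M[i][k] & e[k] for k in range(m - 1)) & 1)
            for i in range(m)]


P4 = [1, 1, 0, 0, 1]                # P(x) = x^4 + x + 1
# x * x^3 = x^4, which reduces to x + 1 modulo P(x).
product_bits = mastrovito_multiply([0, 1, 0, 0], [0, 0, 0, 1], P4)
assert product_bits == [1, 1, 0, 0]
```

Splitting the product into a low part $d$ and a high part $e$ keeps the only polynomial-dependent step, multiplication by $M$, linear over $GF(2)$, so it can be realized with CNOT gates alone.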

2.2 Elliptic Curve Groups 

In the most general case, we define an elliptic curve over a field $F$ as the set of points
$(x, y) \in F \times F$ which satisfy the equation

$$y^2 + a_1 xy + a_3 y = x^3 + a_2 x^2 + a_4 x + a_6.$$

By extending this curve to the projective plane, we may include the point at infinity O as an 
additional solution. By defining a suitable addition operation, we may interpret the points of 
an elliptic curve as an Abelian group, with O as the identity element. 

In the specific case of the finite field $GF(2^m)$, it is possible to reduce the degrees of
freedom in the coefficients defining the elliptic curve by the use of linear transformations on 
the variables x and y. In addition, it was shown in [T^] that for a class of elliptic curves called 
supersingular curves, it is possible to reduce the discrete logarithm problem for the elliptic 
curve group to a discrete logarithm problem over a finite field in such a way that makes such 
curves unsuitable for cryptography. For $GF(2^m)$, these correspond to elliptic curves with
parameter $a_1 = 0$. We will restrict our attention to non-supersingular curves over $GF(2^m)$,
which are of the form $y^2 + xy = x^3 + ax^2 + b$, where $b \neq 0$.

Fig. 1. Circuit for $GF(2^4)$ multiplier with $P(x) = x^4 + x + 1$.

The set of points over an elliptic curve also forms an Abelian group with $O$ as the identity
element. For a non-supersingular curve over $GF(2^m)$, the group operation is defined in the
following manner. Given a point $P = (x_1, y_1)$ on the curve, we define $(-P)$ as $(x_1, x_1 + y_1)$.
Given a second point $Q = (x_2, y_2)$, where $P \neq \pm Q$, we define the sum $P + Q$ as the point
$(x_3, y_3)$ where $x_3 = \lambda^2 + \lambda + x_1 + x_2 + a$ and $y_3 = (x_1 + x_3)\lambda + x_3 + y_1$, with
$\lambda = \frac{y_1 + y_2}{x_1 + x_2}$. When $P = Q$, we define $2P$ as the point $(x_3, y_3)$ where
$x_3 = \lambda^2 + \lambda + a$ and $y_3 = x_1^2 + \lambda x_3 + x_3$, with $\lambda = x_1 + \frac{y_1}{x_1}$.
Also, any group operation involving $O$ simply conforms to the properties
of a group identity element. Finally, scalar multiplication by an integer can be easily defined
in terms of repeated addition or subtraction.
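The group law above can be exercised on a toy curve. The following sketch assumes the small field $GF(2^4)$ with $P(x) = x^4 + x + 1$ and arbitrarily chosen curve parameters (`CURVE_A`, `CURVE_B`); these values and all function names are illustrative assumptions, and field elements are stored as 4-bit integers.

```python
M_BITS = 4
POLY = 0b10011                       # P(x) = x^4 + x + 1
CURVE_A, CURVE_B = 0b1000, 0b1001    # illustrative parameters with b != 0
INF = None                           # the point at infinity O


def gf_mul(x, y):
    """Multiply two field elements: carry-less product, then reduce modulo POLY."""
    r = 0
    for i in range(M_BITS):
        if (y >> i) & 1:
            r ^= x << i
    for k in range(2 * M_BITS - 2, M_BITS - 1, -1):
        if (r >> k) & 1:
            r ^= POLY << (k - M_BITS)
    return r


def gf_inv(x):
    """Inverse of a non-zero element, computed as x^(2^m - 2)."""
    result = 1
    for _ in range(2 ** M_BITS - 2):
        result = gf_mul(result, x)
    return result


def on_curve(p, a, b):
    """Check y^2 + xy = x^3 + a x^2 + b (the point at infinity always passes)."""
    if p is INF:
        return True
    x, y = p
    x2 = gf_mul(x, x)
    return gf_mul(y, y) ^ gf_mul(x, y) == gf_mul(x2, x) ^ gf_mul(a, x2) ^ b


def ec_neg(p):
    return INF if p is INF else (p[0], p[0] ^ p[1])


def ec_add(p, q, a):
    if p is INF:
        return q
    if q is INF:
        return p
    (x1, y1), (x2, y2) = p, q
    if x1 == x2:
        if y1 != y2 or x1 == 0:      # q = -p (a point with x = 0 is its own negative)
            return INF
        lam = x1 ^ gf_mul(y1, gf_inv(x1))            # doubling
        x3 = gf_mul(lam, lam) ^ lam ^ a
        return x3, gf_mul(x1, x1) ^ gf_mul(lam, x3) ^ x3
    lam = gf_mul(y1 ^ y2, gf_inv(x1 ^ x2))           # addition of distinct points
    x3 = gf_mul(lam, lam) ^ lam ^ x1 ^ x2 ^ a
    return x3, gf_mul(x1 ^ x3, lam) ^ x3 ^ y1


def curve_points(a, b):
    """Enumerate all points of the curve, including the point at infinity."""
    affine = [(x, y) for x in range(2 ** M_BITS) for y in range(2 ** M_BITS)
              if on_curve((x, y), a, b)]
    return [INF] + affine
```

Every affine addition or doubling calls `gf_inv` once, which is the cost that the projective representation of Section 3.2 is designed to avoid.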

The Elliptic Curve Discrete Logarithm problem (ECDLP) is defined as the problem of 
retrieving a constant scalar d given that Q = dP for known points P and Q. With this 
definition, we may define cryptographic protocols, such as Diffie-Hellman or digital signatures,
using the ECDLP by modifying analogous protocols using the discrete logarithm problem 
over finite fields. 

3 Quantum Polynomial-Time Attack 

With a reversible implementation for the basic elliptic curve group operations, it is possible
to solve the ECDLP with a polynomial-depth quantum circuit. Given a base point $P$ and
some scalar multiple $Q = dP$ on an elliptic curve over $GF(2^m)$, Shor's algorithm for discrete
logarithms [19] constructs the state

$$\frac{1}{2^m} \sum_{x=0}^{2^m-1} \sum_{y=0}^{2^m-1} |x\rangle |y\rangle |xP + yQ\rangle,$$

then performs a two-dimensional quantum Fourier transform over the first two registers. It
was shown in [17] that the creation of the above state can be reduced to adding a classically
known point to a superposition of points, by using a "double and add" method analogous to
the square and multiply method of modular exponentiation. Points of the form $2^i P$ and $2^i Q$
can be classically precomputed, and then, starting with the additive identity, group addition
operations can be performed, controlled by the appropriate bits from $|x\rangle$ or $|y\rangle$. Note that all
of the intermediate sums must be preserved until the computation is completed before they
can be uncomputed.
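The control structure of this "double and add" accumulation can be sketched classically for a single pair $(x, y)$. To keep the example short, the sketch below uses the integers modulo $N$ under addition as a stand-in for the elliptic curve group; that substitution, the value of $N$, and all names are assumptions made purely for illustration.

```python
def doubling_table(point, m, add):
    """Classically precompute [P, 2P, 4P, ..., 2^(m-1) P] by repeated doubling."""
    table = [point]
    for _ in range(m - 1):
        table.append(add(table[-1], table[-1]))
    return table


def controlled_sum(x, y, p_table, q_table, add, identity):
    """Accumulate xP + yQ, adding 2^i P (or 2^i Q) only when bit i of x (or y) is set."""
    acc = identity
    intermediates = [acc]           # kept, mirroring registers preserved until uncomputation
    for i in range(len(p_table)):
        if (x >> i) & 1:
            acc = add(acc, p_table[i])
            intermediates.append(acc)
        if (y >> i) & 1:
            acc = add(acc, q_table[i])
            intermediates.append(acc)
    return acc, intermediates


# Stand-in group (an assumption for illustration): integers modulo N under addition.
N = 101


def add_mod(u, v):
    return (u + v) % N


m = 5
P, d = 7, 13
Q = (d * P) % N                     # Q = dP in the stand-in group
p_table = doubling_table(P, m, add_mod)
q_table = doubling_table(Q, m, add_mod)
result, history = controlled_sum(19, 6, p_table, q_table, add_mod, 0)
assert result == (19 * P + 6 * Q) % N
```

In the quantum circuit every addition has a classically known second operand (an entry of one of the two tables), which is the property the later sections rely on.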

3.1 Linear Depth Circuit for $GF(2^m)$ Multiplication in the LNN Architecture

We now discuss how to implement multiplication over $GF(2^m)$ as a quantum circuit. Firstly,
using equations (1)-(3), derive expressions for $d$, $e$ and $c$. We next perform the following steps,
breaking the entire computation into three distinct stages/circuits:

1. Compute $e$ in an ancillary register of $m$ qubits.

2. Transform $e$ into $Me$, using a linear reversible implementation.

3. Compute and add $d$ to the register occupied by $Me$.

We illustrate the above steps with an example using $P(x) = x^4 + x + 1$. Expressions for $d$
and $e$ derived from equations (1) and (2) are shown below.


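$$d = \begin{bmatrix}
a_0 b_0 \\
a_1 b_0 + a_0 b_1 \\
a_2 b_0 + a_1 b_1 + a_0 b_2 \\
a_3 b_0 + a_2 b_1 + a_1 b_2 + a_0 b_3
\end{bmatrix}, \qquad
e = \begin{bmatrix}
a_3 b_1 + a_2 b_2 + a_1 b_3 \\
a_3 b_2 + a_2 b_3 \\
a_3 b_3
\end{bmatrix}.$$

We also construct the matrix $M$:

$$M = \begin{bmatrix}
1 & 0 & 0 \\
1 & 1 & 0 \\
0 & 1 & 1 \\
0 & 0 & 1
\end{bmatrix}.$$

From equation (3), we compute the multiplier output $c = d + Me$:

$$c = \begin{bmatrix}
d_0 + e_0 \\
d_1 + e_1 + e_0 \\
d_2 + e_1 + e_2 \\
d_3 + e_2
\end{bmatrix}.$$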



1. We first compute $e_0$, $e_1$, and $e_2$ in the ancilla, as shown in Figure 1 (gates 1-6).

2. We next implement the matrix transformation $Me$ (gates 7-9).

3. Finally, we compute the coefficients $d_i$, $0 \le i < m$, and add them to the ancilla to
compute $c$ (gates 10-19).
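The three stages can be simulated on classical basis states to confirm that 19 gates compute the product. The sketch below is an illustration under stated assumptions: the gate list follows the stage boundaries described above, but the ordering of gates within each stage and the qubit labels are choices made here and may differ from Figure 1.

```python
M_BITS = 4


def gf16_multiplier_gates():
    """Gate list for GF(2^4) multiplication with P(x) = x^4 + x + 1."""
    gates = []
    # Stage 1 (6 Toffolis): ancilla c_k accumulates a_i b_j with i + j = m + k, giving e_k.
    for i in range(M_BITS):
        for j in range(M_BITS):
            if i + j >= M_BITS:
                gates.append(("TOF", f"a{i}", f"b{j}", f"c{i + j - M_BITS}"))
    # Stage 2 (3 CNOTs): in-place map (e0, e1, e2, 0) -> (e0, e0+e1, e1+e2, e2) = M e.
    gates += [("CNOT", "c2", "c3"), ("CNOT", "c1", "c2"), ("CNOT", "c0", "c1")]
    # Stage 3 (10 Toffolis): add d_k, the sum of a_i b_j with i + j = k, into c_k.
    for i in range(M_BITS):
        for j in range(M_BITS):
            if i + j < M_BITS:
                gates.append(("TOF", f"a{i}", f"b{j}", f"c{i + j}"))
    return gates


def simulate(gates, a, b):
    """Run the gate list on basis inputs a, b (4-bit integers); return the c register."""
    state = {f"c{k}": 0 for k in range(M_BITS)}
    for k in range(M_BITS):
        state[f"a{k}"] = (a >> k) & 1
        state[f"b{k}"] = (b >> k) & 1
    for gate in gates:
        if gate[0] == "TOF":
            _, ctrl1, ctrl2, target = gate
            state[target] ^= state[ctrl1] & state[ctrl2]
        else:
            _, ctrl, target = gate
            state[target] ^= state[ctrl]
    return sum(state[f"c{k}"] << k for k in range(M_BITS))


GATES = gf16_multiplier_gates()
assert len(GATES) == 19
# x * x^3 = x^4, which reduces to x + 1 modulo P(x).
assert simulate(GATES, 0b0010, 0b1000) == 0b0011
```

The CNOT order in stage 2 matters: each target is updated before it is used as a control, so the transformation is carried out in place without extra ancillae.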



At this point, we have a classical reversible circuit which implements the transformation
$|a\rangle|b\rangle|0\rangle \mapsto |a\rangle|b\rangle|a \cdot b\rangle$. However, if we input a superposition of field elements, then the
output register will be entangled with the input. If one of the inputs, such as $|b\rangle$, is classically
known, then we may also obtain $|b^{-1}\rangle$ classically. Since we may construct a circuit which
maps $|a \cdot b\rangle|b^{-1}\rangle|0\rangle \mapsto |a \cdot b\rangle|b^{-1}\rangle|a\rangle$, we may apply the inverse of this circuit to the output
of the first circuit to obtain $|a\rangle|b\rangle|a \cdot b\rangle \mapsto |0\rangle|b\rangle|a \cdot b\rangle$ using an ancilla set to $|b^{-1}\rangle$. This
gives us a quantum circuit which takes a quantum input $|a\rangle$ and classical input $|b\rangle$ and
outputs $|a \cdot b\rangle|b\rangle$. When $|b\rangle$ is not a classical input, the output of the circuit may remain
entangled with the input, and other techniques may be required to remove this entanglement.
However, we emphasize that this is not required for a polynomial-time quantum algorithm
for the ECDLP [17].

In some circumstances, we may derive exact expressions for the number of gates required 
in the GF multiplication circuit. 

Lemma 1 A binary field multiplier for primitive polynomial $P(x)$ can be designed using at
most $2m^2 - 1$ gates. If $P(x)$ is a trinomial or the all-one polynomial, where each coefficient
is 1, we require only $m^2 + m - 1$ gates.

Proof. There are three phases to the computation: computing $e$, computing $Me$, and
adding $d$ to the result. For $e$ and $d$, each pair of coefficients which are multiplied and then
added to another qubit requires one Toffoli gate. This requires $m(m-1)/2$ and $m(m+1)/2$
gates respectively, for a total of $m^2$ gates. Next, consider the implementation of the
transformation $M$.

In general, $m^2 - 1$ CNOT gates suffice for any linear reversible computation defined by
the matrix $M$ in equation (3) [23]. This gives a general upper bound of $2m^2 - 1$ gates. In the
specific case of the all-one polynomial, the operation $M$ consists of adding $e_1$ to each of the
other qubits, requiring $m - 1$ CNOT operations. This gives a total of $m^2 + m - 1$ operations.

For a trinomial, we have a primitive polynomial $P(x) = x^m + x^k + 1$ for some constant $k$
such that $1 \le k < m$. To upper bound the number of gates required to implement $M$, we may
consider the inverse operation, in which we begin with a polynomial of degree at most $m - 1$,
and we wish to find an equivalent polynomial where each term has degree between $m - 1$ and
$2m - 2$. Increasing the minimum degree of a polynomial requires one CNOT operation, and
this must be done $m - 1$ times. Again, this gives a total of $m^2 + m - 1$ operations. QED

3.1.1 Parallelization

We construct a parallelized version of this network by considering the three parts of the
computation: computation of $e$, multiplication by $M$ and in-place computation of $d$. For $e$
and $d$, note that given coefficients $a_i$ and $b_j$ where the value of $i - j$ is fixed, the target qubit
of each separate term $a_i b_j$ is different. This means that they may be performed in parallel.
In the case of $e$, we evaluate $a_i b_j$ whenever $i + j \ge m$. This means that the values of $i - j$
may range from $-(m - 2)$ to $m - 2$, giving a depth $2m - 3$ circuit for finding $e$. Similarly, for
$d$, we evaluate $a_i b_j$ whenever $i + j < m$. The values of $i - j$ range from $-(m - 1)$ to $m - 1$,
giving a depth $2m - 1$ circuit. Evaluation of $d$ in the case of $GF(2^8)$ is illustrated in Figure 2.

In [9], it is shown that every linear computation, such as that of the product $Me$, can
be done in a linear number of stages, with a depth of at most $5m$. Thus, a total depth of
$(2m - 3) + 5m + (2m - 1) = 9m + O(1)$ suffices to implement the multiplication circuit. As
such, an implementation which replaces the Toffoli gate with 2-qubit gates ([14], page 182)
can be done by a circuit with the depth upper bounded by the expression $25m + O(1)$.

Fig. 2. Subcircuit computing $d$ illustrated in the case of multiplication in $GF(2^8)$. Labels L*
indicate the stage at which a given gate is executed. A qubit line turns gray after the given
qubit has been used for the last time in the computation.

3.1.2 Execution of the multiplication circuit in LNN

In this subsection we explain how to execute the GF multiplication circuit in linear depth
$O(m)$ in the LNN architecture. Our circuit consists of three distinct stages: creation of $e$,
followed by a linear reversible transformation, and the in-place calculation of $d$. As shown
in [9], the middle part of this calculation can be executed as a depth $5m$ computation in
the LNN architecture. We next show that the first and third parts in our construction can also
be modified to become a linear depth computation in the LNN. At this point, we note that
both subcircuits share identical structure, and as such we only need to consider either one.
We choose the circuit for computing $d$. In the following, we will use its parallelized version
described in the previous subsection and separate every two computational stages by a
depth-2 qubit swapping stage such as to make it possible to execute the entire computation in the
LNN in linear depth.
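The stage assignment of the parallelized circuit, which the LNN adaptation reuses, can be sketched as follows. The code groups the Toffoli terms $a_i b_j$ by the value of $i - j$ and checks that the gates within one group touch disjoint qubits; the function names are illustrative assumptions, and the sketch does not model the swapping stages.

```python
def toffoli_stages(m, part):
    """Map each value of i - j to the terms (i, j, target) executed in that stage.

    part == "d" selects terms with i + j < m   (target index i + j);
    part == "e" selects terms with i + j >= m  (target index i + j - m).
    """
    stages = {}
    for i in range(m):
        for j in range(m):
            if part == "d" and i + j < m:
                stages.setdefault(i - j, []).append((i, j, i + j))
            elif part == "e" and i + j >= m:
                stages.setdefault(i - j, []).append((i, j, i + j - m))
    return stages


def is_parallel(stage):
    """True when no a_i, b_j or target qubit is shared between gates of the stage."""
    a_idx = [i for i, _, _ in stage]
    b_idx = [j for _, j, _ in stage]
    targets = [t for _, _, t in stage]
    return all(len(set(v)) == len(v) for v in (a_idx, b_idx, targets))


m = 8
d_stages = toffoli_stages(m, "d")
e_stages = toffoli_stages(m, "e")
assert len(d_stages) == 2 * m - 1   # i - j ranges over -(m-1), ..., m-1
assert len(e_stages) == 2 * m - 3   # i - j ranges over -(m-2), ..., m-2
assert all(is_parallel(s) for s in d_stages.values())
assert all(is_parallel(s) for s in e_stages.values())
```

For $m = 8$ this gives 15 stages for $d$, which matches the number of stage labels in the figure for the $GF(2^8)$ case.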

First, prepare the qubits in the LNN connectivity pattern
$c_0 - c_1 - \ldots - c_{m-1} - a_{m-1} - b_0 - a_{m-2} - b_1 - \ldots - a_0 - b_{m-1}$.
For that, at most a linear depth qubit swapping stage is required, no
matter what the starting connectivity pattern is. Next, execute a computational stage. For
every Toffoli gate $TOF(c_i; a_j, b_k)$ applied and a qubit $x$ on the left from $c_i$ in the present LNN
connectivity pattern use the depth-2 swapping stage SWAP($c_i$, $a_j$) SWAP($x$, $a_j$) SWAP($c_i$, $b_k$)
to prepare the qubits for the next computational stage (LNN connectivity pattern
$x - c_i - a_j - b_k$ gets transformed to $a_j - x - b_k - c_i$). The workings of such adaptation to the
LNN architecture are illustrated in Figures 2 and 3. Note that the number of swapping
stages is no more than twice the number of the computational stages. Therefore, the entire
computation can be executed in linear depth, not exceeding $34m + O(1)$ (counting 1- and
2-qubit operations), in the LNN architecture.

3.2 Projective Representation 

When points on an elliptic curve are represented in affine coordinates $(x, y)$, performing group
operations on such points requires finding the multiplicative inverse of elements of $GF(2^m)$.
This operation takes much longer to perform than the other field operations required, and it
is desirable to minimize the number of division operations. For example, [7] gives a quantum
circuit of depth $O(m^2)$ which uses the extended Euclidean algorithm.

Fig. 3. Qubit permutation stages for adaptation of the GF multiplication circuit to the LNN
architecture, illustrated in the case of $GF(2^8)$ and the computation of $d$. The arrows indicate the
position to which the individual qubits get swapped. In particular, one arrow style indicates that
the given qubit will not be used in the remainder of the computation, and it gets moved to the
rightmost position. Triples of qubits highlighted in light gray experience application of the
Toffoli gates. Dark gray is used to highlight qubits that are not used in the remainder of the
computation.


By using projective coordinate representation, we can perform group operations without
division. Instead of using two elements of $GF(2^m)$ to represent a point, we use three elements,
$(X, Y, Z)$, to represent the point $(X/Z, Y/Z)$ in affine coordinates. Dividing $X$ and $Y$ by a certain
quantity is now equivalent to multiplying the third coordinate ($Z$) by this quantity. Extensions
to this concept have also been explored, where different information about an elliptic curve
point is stored in several coordinates. Another advantage of projective coordinates is that
the point at infinity $O$ can simply be represented by setting $Z$ to zero. However, in order
to retrieve the elliptic curve point in the affine representation, we still need to perform one
multiplicative inversion at the end.

To represent the point $(X, Y)$, we simply begin with the representation

$$|P(X, Y)\rangle = |X\rangle |Y\rangle |1\rangle.$$

As we perform elliptic curve group operations, the third coordinate will not remain constant.
Exact formulas for point addition in projective coordinates can be easily derived by taking
the formulas for the affine coordinates under a common denominator and multiplying the $Z$
coordinate by this denominator. These are detailed in [5]. Since the ECDLP can be solved by
implementing elliptic curve point addition where one point is "classically known" [17], we may
implement these formulas using the multiplication algorithm presented in Section 3.1 and by
being careful to uncompute any temporary registers used. Since the number of multiplication
operations used in these formulas is fixed, we may implement elliptic curve point addition
with a known classical point with a linear depth circuit.
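The common-denominator derivation can be made concrete for the addition of a projective point $(X_1, Y_1, Z_1)$ and a classically known affine point $(x_2, y_2)$. The formulas in the sketch below were derived here by substituting $x_1 = X_1/Z_1$, $y_1 = Y_1/Z_1$ into the affine addition law with $\lambda = A/B$; they are an illustration and are not claimed to match the formulas of the cited reference. The field $GF(2^4)$ and all names are assumptions.

```python
M_BITS = 4
POLY = 0b10011                      # P(x) = x^4 + x + 1


def gf_mul(x, y):
    """Multiply two field elements: carry-less product, then reduce modulo POLY."""
    r = 0
    for i in range(M_BITS):
        if (y >> i) & 1:
            r ^= x << i
    for k in range(2 * M_BITS - 2, M_BITS - 1, -1):
        if (r >> k) & 1:
            r ^= POLY << (k - M_BITS)
    return r


def gf_inv(x):
    """Inverse of a non-zero element, computed as x^(2^m - 2)."""
    result = 1
    for _ in range(2 ** M_BITS - 2):
        result = gf_mul(result, x)
    return result


def affine_add(p, q, a):
    """Reference affine addition for points with distinct x-coordinates."""
    (x1, y1), (x2, y2) = p, q
    lam = gf_mul(y1 ^ y2, gf_inv(x1 ^ x2))
    x3 = gf_mul(lam, lam) ^ lam ^ x1 ^ x2 ^ a
    return x3, gf_mul(x1 ^ x3, lam) ^ x3 ^ y1


def mixed_add(p, q, a):
    """Add projective p = (X1, Y1, Z1) and affine q = (x2, y2) using no inversion."""
    (X1, Y1, Z1), (x2, y2) = p, q
    A = Y1 ^ gf_mul(y2, Z1)         # numerator of lambda (scaled by Z1)
    B = X1 ^ gf_mul(x2, Z1)         # denominator of lambda (scaled by Z1)
    B2 = gf_mul(B, B)
    B3 = gf_mul(B2, B)
    # C / (B^2 Z1) equals lambda^2 + lambda + x1 + x2 + a.
    C = gf_mul(Z1, gf_mul(A, A) ^ gf_mul(A, B) ^ gf_mul(a, B2)) ^ B3
    Z3 = gf_mul(B3, Z1)             # common denominator of x3 and y3
    X3 = gf_mul(B, C)
    Y3 = gf_mul(A, gf_mul(x2, gf_mul(B2, Z1)) ^ C) ^ X3 ^ gf_mul(y2, Z3)
    return X3, Y3, Z3


def to_affine(p):
    """Recover (X/Z, Y/Z); this is the single inversion, performed at the end."""
    X, Y, Z = p
    z_inv = gf_inv(Z)
    return gf_mul(X, z_inv), gf_mul(Y, z_inv)
```

`mixed_add` uses a fixed number of field multiplications and no inversion, which is the property that lets each point addition run in linear depth when built from the multiplier of Section 3.1. The sketch covers only the generic case $P \neq \pm Q$ with $Z_1 \neq 0$.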

Finally, to construct the state required for solving the ECDLP, we use the standard "double
and add" technique, which requires implementing the point addition circuit for each value
$2^i P$ and $2^i Q$, where $0 \le i < m$. Note that these points are classically known, so that at each
step, we are performing point addition with one classically known point. When the final state
is constructed, each $|xP + yQ\rangle$ will consist of three coordinates $|X\rangle|Y\rangle|Z\rangle$. Since the presence
of the third coordinate $Z$ will interfere with the discrete logarithm algorithm, we must revert
to an affine coordinate representation. An algorithm to compute the multiplicative inverse
of an element of $GF(2^m)$ using an $O(m^2)$-depth circuit is given in [7]. Using $|Z^{-1}\rangle$, we
may compute $|XZ^{-1}\rangle|YZ^{-1}\rangle$, as required, before uncomputing $|Z^{-1}\rangle$. Since $|X\rangle|Y\rangle|Z\rangle$
must now be uncomputed, this step must occur before any of the temporary registers used in
computing them are themselves uncomputed. The result is the desired state

$$\frac{1}{2^m} \sum_{x=0}^{2^m-1} \sum_{y=0}^{2^m-1} |x\rangle |y\rangle |xP + yQ\rangle$$

in affine coordinates. As a final detail, we also need to address the point at infinity, $O$, which
is the identity element of the elliptic curve group. In projective coordinates, $O$ is represented
by any $(X, Y, Z)$ where $Z = 0$. In this case, we will not be able to perform multiplicative
inversion on $Z$. However, since the ensuing quantum Fourier transform only requires that
each point have a consistent representation, we may simply select the coordinates of a point
which is known not to lie on the elliptic curve to represent $O$. The final registers can simply
be set to these coordinates in the case that $Z = 0$.

This represents an improvement on the algorithm of [7], as multiplicative inversion is used
only once, at the end of this algorithm, rather than at each elliptic curve point operation. In
total, we perform $O(m)$ instances of the linear depth multiplication circuit, one instance of
the $O(m^2)$-depth multiplicative inversion circuit, and finally, a quantum Fourier transform.
This gives a final depth complexity of $O(m^2)$ for the circuit which solves the ECDLP over
$GF(2^m)$ in the LNN architecture. This improves the previously known upper bound of $O(m^3)$.

4 Conclusion 

We considered the optimization of the quantum attack on the elliptic curve discrete logarithm
problem, on which elliptic curve cryptography is based. Our constructions include a linear
depth circuit for binary field multiplication and efficient data representation using projective
coordinates. Our main result is the depth $O(m^2)$ circuit executable in the LNN architecture
for computing the discrete logarithm over elliptic curves over $GF(2^m)$. Further research may
be devoted toward better optimization, further study of architectural implications, and
fault tolerance issues.

Interestingly, our circuit is slightly (by a logarithmic factor) more efficient than the best
known circuit for integer factoring optimized for the LNN architecture, allowing linear ancilla
and assuming gates with diminishingly small parameters cannot be used [8]. (We believe this
is related to the necessity of performing carry over during integer addition, which is not
required for addition over $GF(2^m)$.) However, our circuit reduces an exponential classical
search to a polynomial-time quantum computation, whereas integer factoring can be done
classically with a subexponential time algorithm. Considering the relative efficiency of ECC as
compared to RSA, we suggest referring to the ability to solve the ECDLP as a stronger practical
argument for quantum computing.